In this lab, you’ll use the techniques we studied this week to perform two analyses with applications in a military/DoD context:
A kernel density analysis of criminal events (IED attacks, ambushes, etc.) to build predictive “threat maps” that illustrate the expected locations of future enemy attacks (a frequently applied technique in the intelligence and combat engineer domains). We will use open-source datasets as a proxy for the data seen in these military problems.
Before beginning the lab below, use the RStudio IDE file management pane (lower right quadrant) to upload the Lab3.zip file onto your RStudio instance.
Kernel Density Estimation for Building Threat Maps

The zip folder you were provided contains two shapefiles:
Burglaries2008.shp contains more than 3,000 burglaries committed in the city of Pittsburgh during 2008.
PolicePrecincts.shp contains a polygon layer of the police precincts in the City of Pittsburgh.
We will use these shapefiles as a proxy for the military problem of estimating the underlying pattern of enemy attacks in an area of interest.
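Before working in two dimensions, it helps to see the idea of kernel density estimation in one dimension: each observed event contributes a small "bump" of density, and the bumps sum into a smooth surface whose peaks mark clusters. A minimal sketch with simulated event locations, using only base R's density():

```r
# Simulate event locations along a single road: two clusters of activity
set.seed(42)
events <- c(rnorm(50, mean = 10, sd = 2), rnorm(30, mean = 25, sd = 2))

# Gaussian-kernel density estimate; bw controls the smoothing bandwidth
# (the 1-D analogue of the RADIUS parameter used below)
kde <- density(events, bw = 2)

# Peaks in kde$y mark where events cluster -- the 1-D analogue of a threat map
plot(kde, main = "KDE of simulated event locations")
```

A larger bandwidth merges the two clusters into one broad peak; a smaller one resolves finer structure but becomes noisy. The same trade-off appears in the spatial heat maps below.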
library(sf)          # vector data handling
library(qgisprocess) # run QGIS algorithms from R
library(raster)      # raster handling (projection(), crs())
library(mapview)     # interactive maps

Burglaries <- read_sf(getwd(), "Burglaries2008")
Precincts  <- read_sf(getwd(), "PolicePrecincts")

# 50/50 split: fit the heat map on the first 100 burglaries and hold out
# the next 100 to check the fit visually
BurglariesTrain <- Burglaries[1:100, ]
BurglariesTest  <- Burglaries[101:200, ]

resultsBur.train <- st_as_sf(BurglariesTrain, coords = c("Longitude", "Latitude"),
                             crs = 4269, agr = "constant")
resultsBur.train.proj <- st_transform(resultsBur.train, crs = 2278)

resultsBur.test <- st_as_sf(BurglariesTest, coords = c("Longitude", "Latitude"),
                            crs = 4269, agr = "constant")
resultsBur.test.proj <- st_transform(resultsBur.test, crs = 2278)
BURG_dens <- qgis_run_algorithm(algorithm = "qgis:heatmapkerneldensityestimation",
                                INPUT = resultsBur.train.proj,
                                RADIUS = 5000,      # search radius in CRS units
                                PIXEL_SIZE = 100,
                                KERNEL = 0,         # 0 = quartic kernel
                                OUTPUT = file.path(getwd(), "burgdenst.TIF"))
## Argument `RADIUS_FIELD` is unspecified (using QGIS default value).
## Argument `WEIGHT_FIELD` is unspecified (using QGIS default value).
## Argument `DECAY` is unspecified (using QGIS default value).
## Using `OUTPUT_VALUE = "Raw"`
result <- qgis_as_raster(BURG_dens)
projection(result) <- crs(resultsBur.train.proj)
mapview(result) + mapview(resultsBur.test.proj)
## Warning in validateCoords(lng, lat, funcName): Data contains 6 rows with either
## missing or invalid lat/lon values and will be ignored
With an equal number of observations in the heat map and the test projection (a 50/50 split), the fit is off in some areas. It should improve as we move toward a training-heavy split, closer to an 80/20 ratio.
# 66/33 split: fit on the first 200 burglaries, hold out the next 100
BurglariesTrain <- Burglaries[1:200, ]
BurglariesTest  <- Burglaries[201:300, ]

resultsBur.train <- st_as_sf(BurglariesTrain, coords = c("Longitude", "Latitude"),
                             crs = 4269, agr = "constant")
resultsBur.train.proj <- st_transform(resultsBur.train, crs = 2278)

resultsBur.test <- st_as_sf(BurglariesTest, coords = c("Longitude", "Latitude"),
                            crs = 4269, agr = "constant")
resultsBur.test.proj <- st_transform(resultsBur.test, crs = 2278)
BURG_dens <- qgis_run_algorithm(algorithm = "qgis:heatmapkerneldensityestimation",
                                INPUT = resultsBur.train.proj,
                                RADIUS = 5000,
                                PIXEL_SIZE = 100,
                                KERNEL = 0,
                                OUTPUT = file.path(getwd(), "burgdenst.TIF"))
## Argument `RADIUS_FIELD` is unspecified (using QGIS default value).
## Argument `WEIGHT_FIELD` is unspecified (using QGIS default value).
## Argument `DECAY` is unspecified (using QGIS default value).
## Using `OUTPUT_VALUE = "Raw"`
result <- qgis_as_raster(BURG_dens)
projection(result) <- crs(resultsBur.train.proj)
mapview(result) + mapview(resultsBur.test.proj)
## Warning in validateCoords(lng, lat, funcName): Data contains 4 rows with either
## missing or invalid lat/lon values and will be ignored
Here we see that even at a 66/33 split, the fit is getting noticeably better. A few test observations still fall outside the heat map, and some areas of the heat map have no test observations projected over them, but this fit looks much better than the previous 50/50 split.
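The next run repeats the same split/KDE/map steps with larger samples. Since the workflow is identical each time, it can be factored into a helper function. This is a sketch, assuming the packages loaded earlier are attached and that Burglaries already carries point geometries from read_sf(); the function name and arguments (kde_split_map, radius, pixel) are illustrative, not part of the lab:

```r
# Illustrative helper: fit a KDE heat map on a training slice of the
# burglary points and overlay a held-out test slice for visual checking
kde_split_map <- function(burglaries, train_idx, test_idx,
                          radius = 5000, pixel = 100) {
  train <- st_transform(burglaries[train_idx, ], crs = 2278)
  test  <- st_transform(burglaries[test_idx, ],  crs = 2278)
  dens  <- qgis_run_algorithm(algorithm = "qgis:heatmapkerneldensityestimation",
                              INPUT = train, RADIUS = radius,
                              PIXEL_SIZE = pixel, KERNEL = 0,
                              OUTPUT = file.path(getwd(), "burgdenst.TIF"))
  r <- qgis_as_raster(dens)
  projection(r) <- crs(train)
  mapview(r) + mapview(test)
}

# e.g. the 66/33 run above: kde_split_map(Burglaries, 1:200, 201:300)
```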
# Same 66/33 ratio, but five times the data: fit on 1,000, hold out 500
BurglariesTrain <- Burglaries[1:1000, ]
BurglariesTest  <- Burglaries[1001:1500, ]

resultsBur.train <- st_as_sf(BurglariesTrain, coords = c("Longitude", "Latitude"),
                             crs = 4269, agr = "constant")
resultsBur.train.proj <- st_transform(resultsBur.train, crs = 2278)

resultsBur.test <- st_as_sf(BurglariesTest, coords = c("Longitude", "Latitude"),
                            crs = 4269, agr = "constant")
resultsBur.test.proj <- st_transform(resultsBur.test, crs = 2278)
BURG_dens <- qgis_run_algorithm(algorithm = "qgis:heatmapkerneldensityestimation",
                                INPUT = resultsBur.train.proj,
                                RADIUS = 5000,
                                PIXEL_SIZE = 100,
                                KERNEL = 0,
                                OUTPUT = file.path(getwd(), "burgdenst.TIF"))
## Argument `RADIUS_FIELD` is unspecified (using QGIS default value).
## Argument `WEIGHT_FIELD` is unspecified (using QGIS default value).
## Argument `DECAY` is unspecified (using QGIS default value).
## Using `OUTPUT_VALUE = "Raw"`
result <- qgis_as_raster(BURG_dens)
projection(result) <- crs(resultsBur.train.proj)
mapview(result) + mapview(resultsBur.test.proj)
## Warning in validateCoords(lng, lat, funcName): Data contains 37 rows with
## either missing or invalid lat/lon values and will be ignored
Here we are still at a 66/33 split between the heat-map (training) data and the projected test data, but with five times the observations. With the larger sample, the underlying distribution is becoming well mapped. However, we lose some of the granularity we had in the previous 300-observation visualization. This is a trade-off we would need to evaluate deliberately when deciding how many observations to use to fit the underlying distribution. Because this is time-series data, we would likely want to set a threshold on the number of previous burglaries used to fit the heat map when predicting future burglary hot spots.
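The trade-off described above can also be evaluated quantitatively rather than by eye, for example by sampling the fitted density surface at each held-out test point. A sketch, assuming result (the KDE raster) and resultsBur.test.proj exist as created above:

```r
# Score the heat map by the density it predicts at each held-out test point;
# a higher mean suggests the test burglaries fall in predicted hot spots
test_coords  <- sf::st_coordinates(resultsBur.test.proj)        # projected XY matrix
dens_at_test <- raster::extract(result, test_coords[, c("X", "Y")])
mean(dens_at_test, na.rm = TRUE)  # compare this score across splits and radii
```

Recomputing this score for several training sizes and RADIUS values would give a more defensible basis for the threshold decision than visual inspection alone.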